DETAILED ACTION
Information Disclosure Statement 

The information disclosure statement (IDS) submitted on 3/29/2004 has been considered 
by the examiner. 

Claim Rejections - 35 USC §103 
The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 

Claims 1-13 are rejected under 35 U.S.C. 103(a) as being unpatentable over Trumbull 
U.S. Patent No. 6,769,771 (hereinafter Trumbull), in view of Endo et al. U.S. Patent No. 
6,335,754 (hereinafter Endo), Hoch U.S. Patent No. 6,850,250 (hereinafter Hoch) and Robotham 
et al. U.S. Patent No. 6,160,907 (hereinafter Robotham). 

Claim 1: 

Trumbull teaches a system for producing composite images of real images and computer- 
generated three-dimensional images comprising: 

A real camera configured to generate a series of real images and equipped with one or 
more sensors to record real camera metadata (e.g., camera support system provide signals to a 
motion tracking system for precise synchronization and orientation of the virtual background 
with the recorded foreground image wherein motion tracking includes all of the high resolution 
meta-data, i.e., digital information that describes the actual content of film or video such as
lighting set-up and camera position and a camera motion tracking system receives encoded data
from a plurality of extremely accurate optical sensors disposed at predetermined locations on 
the camera support system to provide accurate camera lens registration; see column 7, lines 40- 
64; and zoom and focus axes are also encoded using optical sensors mounted on the lens and 
inertial and ultrasonic sensors may be used to track camera orientation and position over 
virtually infinitely scalable range using expandable motion tracking technology; see column 8, 
lines 9-12; moreover, in column 4, lines 25-30, it is stated that "all recordings include meta-
data, i.e., time-code and continuity notes as well as MOCODE data for later camera movement
matching"), at least one of said sensors being adapted to compute positional and orientation
coordinates relative to a fixed point ("a fixed point" is a target 121 in a studio or the actor 110;
see column 8, lines 1-12 and the position and orientation of the studio camera is then calculated
using real time analysis of the image to identify each target; see column 8, lines 1-12);

A metadata alignment device adapted to align said real camera metadata in time (e.g., A 
motion tracking system provides accurate control data to link live-action photography, with the 
computer-generated imagery; see column 4, lines 30-49; and a keying unit 43 detects those parts 
of the camera image that contain the key color and replaces them with the virtual background 
image that is in perspective with the camera point of view to create a composite; column 4, lines 
52-64; and the pattern 51 contains all the data necessary for a computerized pattern recognition 
program to calculate instantaneously all necessary camera and lens parameters, such as, focal 
length of the lens, proximity to the cyclorama, pan, tilt, roll, yaw and the camera's position on
the stage in x, y, and z coordinates; to transform the background image to the perspective of the
camera point of view, the physical parameters of the stage and props, such as the size of the
walls, ceilings, floors and furniture, are first determined and stored in a database as numerical
data. Using the numerical data, and the position and optical parameters of the camera derived 
from a motion tracking system, the virtual background image is transformed using standard 
perspective transformation equations to be in perspective with respect to the orientation of the 
camera. The foreground images are then composited with the accurately aligned computer 
generated background images and transmitted to a camera viewfinder and a non-linear editor 
for immediate editing. Such alignment and composition is performed in real time using a system 
such as the Ultimatte 9, with features specifically designed for virtual studio applications, 
including automatic background defocusing, over-exposure control, color conformance with 
ambience and color controls, edge control, matte sizing, and positioning; see column 5, lines 6- 
32; therefore the cited reference teaches a metadata alignment device), said aligned camera 
metadata being associated with one image frame to form aligned associated camera metadata 
(e.g., the aligned camera metadata are associated with the background image frame); 

A computer system adapted to generate a two-dimensional representation of a pre- 
prepared three-dimensional scene using a virtual camera (Virtual worlds are produced directly 
from a series of still images, i.e., two-dimensional images and a set of images of a scene and 
their corresponding depth images and the image can be re-rendered from any nearby point of 
view by projecting the pixels of the image onto a new image plane in their proper 3D locations; 
see column 8, lines 23-64; here the computer system is adapted to generate the still images and 
all aspects of color, texture, lighting and shape are captured digitally for manipulation in 3D 
computer graphics so that background objects can be altered, moved, re-colored, re-lit or 
otherwise modified based on the two-dimensional representation of a pre-prepared three-
dimensional scene) and further being adapted to receive said aligned associated camera
metadata (e.g., the aligned metadata for the background image were generated by Ultimatte 9) 
and to calibrate said aligned associated camera metadata (see column 10, lines 25-67 and column
11, lines 1-29, wherein the cited reference teaches the virtual set and its accompanying meta- 
data can be reconfigured and the use of a digitally controlled lighting grid 132 and fixtures 133 
allows for the digital information on the movement, color, intensity, and beam pattern of lighting 
fixtures be determined and transmitted to a digitally controlled lighting unit, to remotely pan, 
tilt, zoom, colorize, dim and barndoor multiple units instantaneously. Therefore, the aligned 
camera metadata are calibrated or changed) against reference tables (against reference
composites; column 11, lines 30-40) matching said real camera (e.g., AAF provides meta-data
support including comprehensive source referencing and composition information; column 11, 
lines 54-65), said virtual camera being configured and parameterized with virtual camera 
parameters (the reconfigured meta-data) to simulate said real camera, said virtual camera
parameters being controlled in real time (real-time compositing), said computer system further
being adapted to record calibrated camera metadata (because all meta-data is online all the time,
from pre-production through post-production; column 11, lines 41-46) and to generate said two-
dimensional representation of said pre-prepared three-dimensional scene using virtual camera
metadata linked via calibrated camera metadata to the real camera (all meta-data is online all the
time from pre-production through post-production), producing a series of generated images
having at least one image quality corresponding with the image quality of the real image
(rendering photorealistic computer graphics imagery; see column 2, lines 55-58).

It is not clear whether Trumbull discloses, "said aligned camera metadata being
associated with one image frame via a camera time code to form aligned associated camera 
metadata." 

However, Endo discloses the claim limitation of "said aligned camera metadata being 
associated with one image frame via a camera time code to form aligned associated camera 
metadata" (because Endo discloses that the meta-data and the video scene are linked through 
separate files by the common time code and the camera parameters or meta-data are 
modified/aligned for the panoramic image frame via a camera time code to form aligned 
associated camera metadata; see Endo columns 12-15).
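
The time-code linkage described above can be illustrated with a minimal sketch. It is not taken from Endo or from the application; the record types `Frame` and `CameraMetadata`, their fields, and the function `associate_by_timecode` are hypothetical names chosen for illustration, and the time code is assumed to be an HH:MM:SS:FF string shared by the video file and the metadata file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CameraMetadata:
    # Hypothetical metadata record; field names are assumptions.
    timecode: str          # assumed HH:MM:SS:FF string
    pan_deg: float
    tilt_deg: float
    focal_length_mm: float


@dataclass(frozen=True)
class Frame:
    # Hypothetical image-frame record stored separately from the metadata.
    timecode: str
    image_id: str


def associate_by_timecode(frames, metadata):
    """Pair each frame with the metadata record carrying the same time code.

    Frames with no matching metadata are paired with None.
    """
    by_timecode = {m.timecode: m for m in metadata}
    return [(f, by_timecode.get(f.timecode)) for f in frames]


frames = [Frame("01:00:00:00", "frame-0"), Frame("01:00:00:01", "frame-1")]
metadata = [CameraMetadata("01:00:00:01", 12.5, -3.0, 35.0)]
pairs = associate_by_timecode(frames, metadata)
```

In this sketch the common time code is the only key joining the two separately stored records, which is the sense in which one image frame and its camera metadata become "associated".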

It would have been obvious to have incorporated Endo's method of linking the aligned 
camera metadata with an image frame via a time code because Trumbull teaches intuitive 
alignment of live action foreground with the virtual background and to allow various locations, 
compositions, scenes, and complex camera moves to be explored in real-time in a virtual 
environment (Trumbull column 1, lines 60-67). Trumbull further teaches that the aligned camera metadata
is linked/referenced to the foreground image frame by keying (Trumbull column 1, lines 50-58)
or by encoding (Trumbull column 8, lines 1-12) and that AAF provides meta-data support including
comprehensive source referencing and composition information (Trumbull column 11, lines 60-
65), thereby suggesting using the time code to link the meta-data to the foreground image
frame (Trumbull column 4, lines 25-30).

One of ordinary skill in the art would have been motivated to time code the meta-
data and the foreground image frame for later camera movement matching (Trumbull column 4,
lines 25-30) and for compositing the images in production or post-production (Trumbull column
2).

It is not clear whether Trumbull and Endo explicitly disclose the claim limitations of 
"real camera", "virtual camera" within the claim limitation of "using a virtual camera and further 
being adapted to receive said aligned associated camera metadata and to calibrate said aligned 
associated camera metadata against reference tables matching said real camera, said virtual 
camera being configured and parameterized with virtual camera parameters to simulate said real 
camera, said virtual camera parameters being controlled in real time, said computer system 
further being adapted to record calibrated camera metadata". 

However, Hoch discloses the claim limitations of "real camera" and "virtual camera" and 
compositing between the virtual and real scene (see Hoch Figs. 5-7), and camera meta-data is 
synchronized with the broadcast video via time code information (Hoch column 8, lines 50-55). 

Moreover, Hoch discloses the claim limitations of "using a virtual camera and further
being adapted to receive said aligned associated camera metadata and to calibrate said aligned 
associated camera metadata against reference tables matching said real camera, said virtual 
camera being configured and parameterized with virtual camera parameters to simulate said real 
camera, said virtual camera parameters being controlled in real time, said computer system 
further being adapted to record calibrated camera metadata" (Hoch discloses in columns 5-10 that
the camera sensor data 12 include some or all the variables for camera position, orientation, 
field of view, pan, tilt, zoom, optical center shift in x, y coordinates and the coefficient of 
distortion and the virtual camera's viewpoint is set to correspond to that of the real camera
using the viewpoint node, with the viewpoint being a predefined viewing
position and orientation in the virtual world. Therefore, the viewpoint node contains the virtual
camera parameters to simulate the real camera. Hoch further discloses that the real camera's lens
distortions are transferred onto the add-on graphics by using the grid geometry 58 to distort the
add-on graphics to match the distortions of the real camera 10 and the virtual camera's position
is correlated to that of the real camera 10 based on the incoming real camera instrumentation 
data 12 and the orientation of the virtual camera 51 is calculated based on the pan, tilt and twist 
of the real camera 10 and the field of view of the virtual camera is set to correspond to that of 
the real camera 10 and the virtual camera 51 is used to render the virtual scene including the 
add-on graphics to be inserted in the real video frame to the corresponding real camera 10 to do 
frame aligned graphics insertion and with lens distortion, real objects may appear aligned with 
the virtual set as the camera pans or zooms and therefore the composited. Hoch discloses that 
the compositor can use a presentation time stamp to present the data at the appropriate time 
during the broadcast and the orientation of the virtual camera is calculated/calibrated matching 
the parameters associated with the real camera and the virtual camera parameters are 
controlled in real time in the sense that the system proceeds to modify the distortion grid based 
on the new distortion parameters shown in blocks 91-93 if the distortion parameters have 
changed and thereby the distortion parameters for the virtual camera are updated in real time 
and the Camera viewpoint node stores the updated distortion parameters). 

It would have been obvious to have incorporated Hoch's rendering of computer graphics 
image with the video image including the parameterized Gridnode for declarative representation 
of the real camera distortion as the virtual camera meta-data (Hoch column 9, lines 38-58) into 
the system of Trumbull and Endo because Trumbull teaches or suggests the claim limitation of
"A computer system adapted to generate a two-dimensional representation of a pre-
prepared three-dimensional scene using a virtual camera (Virtual worlds are produced directly 
from a series of still images, i.e., two-dimensional images and a set of images of a scene and 
their corresponding depth images and the image can be re-rendered from any nearby point of 
view by projecting the pixels of the image onto a new image plane in their proper 3D locations; 
see column 8, lines 23-64; here the computer system is adapted to generate the still images and 
all aspects of color, texture, lighting and shape are captured digitally for manipulation in 3D 
computer graphics so that background objects can be altered, moved, re-colored, re-lit or 
otherwise modified based on the two-dimensional representation of a pre-prepared three- 
dimensional scene) and further being adapted to receive said aligned associated camera 
metadata (e.g., the aligned metadata for the background image were generated by Ultimatte 9) 
and to calibrate said aligned associated camera metadata (see column 10, lines 25-67 and column 
11, lines 1-29, wherein the cited reference teaches the virtual set and its accompanying meta- 
data can be reconfigured and the use of a digitally controlled lighting grid 132 and fixtures 133 
allows for the digital information on the movement, color, intensity, and beam pattern of lighting 
fixtures be determined and transmitted to a digitally controlled lighting unit, to remotely pan, 
tilt, zoom, colorize, dim and barndoor multiple units instantaneously. Therefore, the aligned 
camera metadata are calibrated or changed) against reference tables (against reference 
composites; column 11, lines 30-40) matching said real camera (e.g., AAF provides meta-data 
support including comprehensive source referencing and composition information; column 11, 
lines 54-65), said virtual camera being configured and parameterized with virtual camera 
parameters (the reconfigured meta-data) to simulate said real camera, said virtual camera 
parameters being controlled in real time (real-time compositing), said computer system further
being adapted to record calibrated camera metadata (because all meta-data is online all the time,
from pre-production through post-production; column 11, lines 41-46) and to generate said two-
dimensional representation of said pre-prepared three-dimensional scene using virtual camera
metadata linked via calibrated camera metadata to the real camera (all meta-data is online all the
time from pre-production through post-production), producing a series of generated images
having at least one image quality corresponding with the image quality of the real image
(rendering photorealistic computer graphics imagery; see column 2, lines 55-58)."

One of ordinary skill in the art would have been motivated to do this to provide
calibration/correction for the aligned associated camera metadata against the camera metadata of 
an initial frame to do frame aligned graphics insertion (Hoch column 5). 

It is not clear whether Trumbull, Endo and Hoch explicitly disclose "a two-dimensional
representation of pre-prepared three-dimensional scene using a virtual camera".

However, Robotham discloses "a two-dimensional representation of pre-prepared three-
dimensional scene using a virtual camera" (e.g., Robotham column 23, lines 19-67).

It would have been obvious to have incorporated Robotham's disclosure in Trumbull, 
Endo and Hoch's system because Trumbull suggests the image transformation using perspective 
transformation (Trumbull column 5). 

One of ordinary skill in the art would have been motivated to do this to allow for the
presentation of a finish-quality rendering and/or image transformation of one or more objects in
the choreography model by calculating projections of the objects in the choreography model
from the virtual stage in which they are defined to a 2D viewing space as specified by a camera
object or other viewing object (Robotham column 23, lines 19-30). Moreover, Robotham further
discloses updating the metadata in the choreography model 19-16 (column 22, lines 40-53).
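
The projection of objects from a virtual stage to a 2D viewing space, as specified by a camera object, can be sketched with a simple pinhole model. This is an illustrative assumption, not the method of Robotham, Trumbull or Hoch: the function `project_point`, the axis conventions (camera looks along +z at zero pan and tilt, positive pan turns toward +x, positive tilt turns toward +y) and the single focal-length parameter are all hypothetical.

```python
import math


def project_point(point, cam_pos, pan_deg, tilt_deg, focal_length):
    """Project a 3D world point onto the 2D image plane of a pinhole camera.

    Returns (x_img, y_img), or None if the point is not in front of the camera.
    """
    # Vector from the camera position to the point, in world coordinates.
    x = point[0] - cam_pos[0]
    y = point[1] - cam_pos[1]
    z = point[2] - cam_pos[2]

    # Undo the pan (rotation about the vertical y axis).
    pan = math.radians(pan_deg)
    x, z = (x * math.cos(pan) - z * math.sin(pan),
            x * math.sin(pan) + z * math.cos(pan))

    # Undo the tilt (rotation about the camera's horizontal x axis).
    tilt = math.radians(tilt_deg)
    y, z = (y * math.cos(tilt) - z * math.sin(tilt),
            y * math.sin(tilt) + z * math.cos(tilt))

    if z <= 0:
        return None  # behind or level with the camera; no valid projection

    # Perspective divide: image coordinates scale with focal length over depth.
    return (focal_length * x / z, focal_length * y / z)


straight_ahead = project_point((1.0, 2.0, 10.0), (0.0, 0.0, 0.0), 0.0, 0.0, 2.0)
panned = project_point((5.0, 0.0, 0.0), (0.0, 0.0, 0.0), 90.0, 0.0, 2.0)
behind = project_point((0.0, 0.0, -1.0), (0.0, 0.0, 0.0), 0.0, 0.0, 2.0)
```

Feeding the same position, pan, tilt and focal length that were recorded for the real camera into such a function is one way a virtual camera could be parameterized to reproduce the real camera's point of view.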

Re Claims 2 and 6:

Trumbull further discloses optical parameters including camera's position coordinates, 
lens parameters, focal length of the lens (column 5). 

Claim 3:

Trumbull further discloses a fixed point being connected to a target or an actor ("a fixed
point" is a target 121 in a studio or the actor 110; see column 5, lines 1-12 and the position and 
orientation of the studio camera is then calculated using real time analysis of the image to 
identify each target; see column 8, lines 1-12). 

Claim 4: 

Trumbull discloses that a plurality of encoders that sense the motion of various elements of
the camera support system provide signals to a motion tracking system for precise
synchronization and orientation of the virtual background with the recorded foreground image
and the camera motion tracking system receives encoded data from a plurality of extremely
accurate optical sensors disposed at predetermined locations on the camera support system to
provide accurate camera lens registration (column 7).

Claim 5: 

Trumbull further discloses in column 4, lines 30-49 a keying unit 43 detects those parts of 
the camera image that contain the key color and replaces them with the virtual background image 
that is in perspective with the camera point of view to create a composite; column 4, lines 52-64; 
and the pattern 51 contains all the data necessary for a computerized pattern recognition program 
to calculate instantaneously all necessary camera and lens parameters, such as, focal length of the
lens, proximity to the cyclorama, pan, tilt, roll, yaw and the camera's position on the stage in x,
y, and z coordinates; to transform the background image to the perspective of the camera point of
view, the physical parameters of the stage and props, such as the size of the walls, ceilings, floors 
and furniture, are first determined and stored in a database as numerical data. Using the 
numerical data, and the position and optical parameters of the camera derived from a motion 
tracking system, the virtual background image is transformed using standard perspective 
transformation equations to be in perspective with respect to the orientation of the camera. The 
foreground images are then composited with the accurately aligned computer generated 
background images and transmitted to a camera viewfinder and a non-linear editor for immediate 
editing. Such alignment and composition is performed in real time using a system such as the 
Ultimatte 9, with features specifically designed for virtual studio applications, including 
automatic background defocusing, over-exposure control, color conformance with ambience and 
color controls, edge control, matte sizing, and positioning; see column 5, lines 6-32; therefore the 
cited reference teaches a metadata alignment device. 

Re Claims 7-10:

Robotham discloses "a two-dimensional representation of pre-prepared three-
dimensional scene using a virtual camera" (e.g., Robotham column 23, lines 19-67).

Re Claim 11:

Hoch discloses in columns 5-10 that the camera sensor data 12 include some or all the
variables for camera position, orientation, field of view, pan, tilt, zoom, optical center shift in x, 
y coordinates and the coefficient of distortion and the virtual camera's viewpoint is set to
correspond to that of the real camera using the viewpoint node, with the viewpoint being a
predefined viewing position and orientation in the virtual world. Therefore, the
viewpoint node contains the virtual camera parameters to simulate real camera. Hoch further 
discloses that the real camera's lens distortions are transferred onto the add-on graphics by using 
the grid geometry 58 to distort the add-on graphics to match the distortions of the real camera 10 
and the virtual camera's position is correlated to that of the real camera 10 based on the 
incoming real camera instrumentation data 12 and the orientation of the virtual camera 51 is 
calculated based on the pan, tilt and twist of the real camera 10 and the field of view of the 
virtual camera is set to correspond to that of the real camera 10 and the virtual camera 51 is used 
to render the virtual scene including the add-on graphics to be inserted in the real video frame to 
the corresponding real camera 10 to do frame aligned graphics insertion and with lens distortion, 
real objects may appear aligned with the virtual set as the camera pans or zooms and therefore 
the composited. Hoch discloses that the compositor can use a presentation time stamp to present 
the data at the appropriate time during the broadcast and the orientation of the virtual camera is 
calculated/calibrated matching the parameters associated with the real camera and the virtual 
camera parameters are controlled in real time in the sense that the system proceeds to modify the 
distortion grid based on the new distortion parameters shown in blocks 91-93 if the distortion 
parameters have changed and thereby the distortion parameters for the virtual camera are 
updated in real time and the Camera viewpoint node stores the updated distortion parameters. 

Claim 12:

Trumbull further discloses pre-production and post-production systems including
telecines, editors, switchers, converters and multi-format monitors involving a second computer
(columns 6 and 12).



Trumbull further discloses reference tables being user-selectable presets for lenses and 
filters (column 7, lines 23-46). 



Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Jin-Cheng Wang whose telephone number is (571) 272-7665. 
The examiner can normally be reached on 8:00 - 6:30 (Mon-Thu). 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Mike Razavi can be reached on (571) 272-7664. The fax phone number for the 
organization where this application or proceeding is assigned is 703-872-9306. 

Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 
Conclusion